Second order optimality in Markov decision chains



Similar resources

Second Order Optimality in Transient and Discounted Markov Decision Chains

Abstract. The article is devoted to second order optimality in Markov decision processes. Attention is primarily focused on the reward variance for discounted models and undiscounted transient models (i.e. where the spectral radius of the transition probability matrix is less than unity). Considering the second order optimality criteria means that in the class of policies maximizing (or minimiz...


On the spectral analysis of second-order Markov chains

Second order Markov chains which are trajectorially reversible are considered. Contrary to the reversibility notion for usual Markov chains, no symmetry property can be deduced for the corresponding transition operators. Nevertheless and even if they are not diagonalizable in general, we study some features of their spectral decompositions and in particular the behavior of the spectral gap unde...


Empirical Bayes Estimation in Nonstationary Markov chains

Estimation procedures for nonstationary Markov chains appear to be relatively sparse. This work introduces empirical Bayes estimators for the transition probability matrix of a finite nonstationary Markov chain. The data are assumed to be of a panel study type in which each data set consists of a sequence of observations on N>=2 independent and identically dis...


Bias Optimality for Multichain Markov Decision Processes

In recent research we find that the policy iteration algorithm for Markov decision processes (MDPs) is a natural consequence of the performance difference formula that compares the difference of the performance of two different policies. In this paper, we extend this idea to the bias-optimal policy of MDPs. We first derive a formula that compares the biases of any two policies which have the sa...


Markov Chains and Mixing Times, second edition

Unlike most books reviewed in the Intelligencer this is definitely a textbook. It assumes knowledge one might acquire in the first two years of an undergraduate mathematics program – basic mathematical probability, plus linear algebra, a little graph theory and the infamous concept of “mathematical maturity”. It has the theorem-proof style of pure mathematics, but with friendly explanations of ...



Journal

Journal title: Kybernetika

Year: 2018

ISSN: 0023-5954,1805-949X

DOI: 10.14736/kyb-2017-6-1086